
9 research outputs found

Topology-Aware Parallelism for NUMA Copying Collectors

Author: A Muddukrishna
Lokesh Gidra
Mohammad Dashti
Takeshi Ogasawara
Xianglong Huang
Y Chicha
Yefim Shuf
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

Abstract. NUMA-aware parallel algorithms in runtime systems attempt to improve locality by allocating memory from local NUMA nodes. Researchers have suggested that the garbage collector should profile memory access patterns or use object locality heuristics to determine the target NUMA node before moving an object. However, these solutions are costly when applied to every live object in the reference graph. Our earlier research suggests that connected objects represented by the rooted sub-graphs provide abundant locality and are appropriate for NUMA architectures. In this paper, we utilize the intrinsic locality of rooted sub-graphs to improve parallel copying collector performance. Our new topology-aware parallel copying collector preserves rooted sub-graph integrity by moving the connected objects as a unit to the target NUMA node. In addition, it distributes and assigns the copying tasks to appropriate (i.e., NUMA node local) GC threads. For load balancing, our solution enforces locality on the work-stealing mechanism by stealing from local NUMA nodes only. We evaluated our approach on SPECjbb2013, DaCapo 9.12 and Neo4j. Results show up to a 2.5x speedup in GC performance and up to 37% better application performance.

CiteSeerX

Crossref

Enlighten

A characterization of a Java-based commercial workload on a high-end enterprise server

Author: Ian M. Steiner
Yefim Shuf
Publication venue: Association for Computing Machinery (ACM)

Crossref

Multiple Page Size Support in the Linux Kernel

Author: Hubertus Franke
Simon Winwood
Yefim Shuf
Publication date: 01/01/2002

The Linux kernel currently supports a single user space page size, usually the minimum dictated by the architecture. This paper describes the ongoing modifications to the Linux kernel to allow applications to vary the size of pages used to map their address spaces and to reap the performance benefits associated with the use of large pages.

CiteSeerX

Exploiting Prolific Types for Memory Management and Optimizations

Author: Jaswinder Pal Singh
Manish Gupta
Rajesh Bordawekar
Yefim Shuf
Publication venue: ACM Press
Publication date: 01/01/2002

In this paper, we introduce the notion of prolific and non-prolific types, based on the number of instantiated objects of those types. We demonstrate that distinguishing between these types enables a new class of techniques for memory management and data locality, and facilitates the deployment of known techniques. Specifically, we first present a new type-based approach to garbage collection that has similar attributes but lower cost than generational collection. Then we describe the short type pointer technique for reducing memory requirements of objects (data) used by the program. We also discuss techniques to facilitate the recycling of prolific objects and to simplify object co-allocation decisions.

CiteSeerX

Crossref

Creating and preserving locality of Java applications at allocation and garbage collection times

Author: Yefim Shuf
Manish Gupta
Hubertus Franke
Andrew Appel
Jaswinder Pal Singh
Publication venue: ACM Press
Publication date: 01/01/2004


Crossref

Electronic Archiving System

Creating and Preserving Locality of Java Applications at Allocation and Garbage Collection Times

Author: Andrew Appel
Hubertus Franke
Jaswinder Pal Singh
Manish Gupta
Yefim Shuf
Publication venue: ACM Press

The growing gap between processor and memory speeds is motivating the need for optimization strategies that improve data locality. A major challenge is to devise techniques suitable for pointer-intensive applications. This paper presents two techniques aimed at improving the memory behavior of pointer-intensive applications with dynamic memory allocation, such as those written in Java. First, we present an allocation time object placement technique based on the recently introduced notion of prolific (frequently instantiated) types. We attempt to co-locate, at allocation time, objects of prolific types that are connected via object references. Then, we present a novel locality-based graph traversal technique. The benefits of this technique, when applied to garbage collection (GC), are twofold: (i) it improves the performance of GC due to better locality during a heap traversal and (ii) it restructures surviving objects in a way that enhances locality. On multiprocessors, this technique can further reduce overhead due to synchronization and false sharing. The experimental results, on a well-known suite of Java benchmarks (SPECjvm98 [26], SPECjbb2000 [27], and jOlden [4]), from an implementation of these techniques in the Jikes RVM [1], are very encouraging. The object co-allocation technique improves application performance by up to 21% (10% on average) in the Jikes RVM configured with a non-copying mark-and-sweep collector. The locality-based traversal technique reduces GC times by up to 20% (10% on average) and improves the performance of applications by up to 14% (6% on average) in the Jikes RVM configured with a copying semi-space collector. Both techniques combined can improve application performance by up to 22% (10% on average) in the Jikes RVM configured with a non..

CiteSeerX

Characterizing the memory behavior of Java workloads

Author: Alpern B.
Barisone A.
Bowers K. R.
Culler D.
Gosling J.
IBM Corp
Jaswinder Pal Singh
Manish Gupta
Mauricio J. Serrano
Mowry T.
Radhakrishnan R.
Shuf Y.
Standard
Yefim Shuf
Publication venue: Association for Computing Machinery (ACM)

Crossref